Overview

Dataset Statistics

Number of Variables 8
Number of Rows 13605401
Missing Cells 5810
Missing Cells (%) 0.0%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 830.4 MB
Average Row Size in Memory 64.0 B
Variable Types
  • Numerical: 8
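
The size and missing-cell figures above can be cross-checked by hand. The sketch below is only a sanity check, not the profiler's own method. It assumes that all 8 columns are stored as 64-bit (8-byte) values and that the exact row count is 13605401, the figure reported for DAYS_INSTALMENT negatives (100.0%) under Dataset Insights. The two missing counts of 2905 come from the DAYS_ENTRY_PAYMENT and AMT_PAYMENT sections. All variable names are illustrative.

```python
# Cross-check of the Dataset Statistics figures.
# Assumption: every column is stored as a 64-bit (8-byte) value.
n_rows = 13_605_401      # exact row count, taken from the DAYS_INSTALMENT insight
n_columns = 8
bytes_per_value = 8

row_bytes = n_columns * bytes_per_value
total_mib = n_rows * row_bytes / 2**20   # binary megabytes (MiB)

# Missing cells: DAYS_ENTRY_PAYMENT and AMT_PAYMENT each report 2905.
missing_cells = 2905 + 2905
missing_pct = 100 * missing_cells / (n_rows * n_columns)

print(f"row size: {row_bytes} B")
print(f"total size: {total_mib:.1f} MiB")
print(f"missing cells: {missing_cells} ({missing_pct:.3f}%)")
```

Under these assumptions the reported "MB" is a binary megabyte (2^20 bytes), and the 0.0% missing share is a rounded value of roughly 0.005%.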

Dataset Insights

  • Similar Distribution: DAYS_INSTALMENT and DAYS_ENTRY_PAYMENT have similar distributions
  • Skewed: NUM_INSTALMENT_VERSION is skewed
  • Skewed: NUM_INSTALMENT_NUMBER is skewed
  • Skewed: AMT_INSTALMENT is skewed
  • Skewed: AMT_PAYMENT is skewed
  • Negatives: DAYS_INSTALMENT has 13605401 (100.0%) negatives
  • Negatives: DAYS_ENTRY_PAYMENT has 13602496 (99.98%) negatives
  • Zeros: NUM_INSTALMENT_VERSION has 4082498 (30.01%) zeros

Variables


SK_ID_PREV

numerical

Approximate Distinct Count 997752
Approximate Unique (%) 7.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 217686416
Mean 1.9034e+06
Minimum 1000001
Maximum 2843499
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • SK_ID_PREV is nearly symmetric, with a negligible right skew (γ1 = 0.0425)

Quantile Statistics

Minimum 1000001
5-th Percentile 1.0841e+06
Q1 1.436e+06
Median 1.9e+06
Q3 2.3715e+06
95-th Percentile 2.7535e+06
Maximum 2843499
Range 1843498
IQR 935544

Descriptive Statistics

Mean 1.9034e+06
Standard Deviation 536202.9055
Variance 2.8751e+11
Sum 2.5896e+13
Skewness 0.04251
Kurtosis -1.2171
Coefficient of Variation 0.2817

SK_ID_CURR

numerical

Approximate Distinct Count 339587
Approximate Unique (%) 2.5%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 217686416
Mean 278444.8817
Minimum 100001
Maximum 456255
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • SK_ID_CURR is nearly symmetric, with a negligible left skew (γ1 = -0.0034)

Quantile Statistics

Minimum 100001
5-th Percentile 118041
Q1 189184
Median 278412
Q3 367456
95-th Percentile 438111
Maximum 456255
Range 356254
IQR 178272

Descriptive Statistics

Mean 278444.8817
Standard Deviation 102718.3104
Variance 1.0551e+10
Sum 3.7884e+12
Skewness -0.003354
Kurtosis -1.197
Coefficient of Variation 0.3689
  • SK_ID_CURR is not normally distributed (p-value 5.149412038123047e-07)

NUM_INSTALMENT_VERSION

numerical

Approximate Distinct Count 65
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 217686416
Mean 0.8566
Minimum 0
Maximum 178
Zeros 4082498
Zeros (%) 30.0%
Negatives 0
Negatives (%) 0.0%
  • NUM_INSTALMENT_VERSION is skewed right (γ1 = 9.5934)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 1
Q3 1
95-th Percentile 2
Maximum 178
Range 178
IQR 1

Descriptive Statistics

Mean 0.8566
Standard Deviation 1.0352
Variance 1.0717
Sum 1.1655e+07
Skewness 9.5934
Kurtosis 259.607
Coefficient of Variation 1.2085
  • NUM_INSTALMENT_VERSION is not normally distributed (p-value 4.251444493823654e-25)
  • NUM_INSTALMENT_VERSION has 417616 outliers
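
For readers unfamiliar with the Skewness (γ1) and Kurtosis rows, the sketch below shows one common way to compute them from central moments. It is an assumption that the profiler uses these population (biased) formulas and reports excess kurtosis. The negative kurtosis values near -1.2 for the roughly uniform ID columns are at least consistent with the excess convention. The function names and the toy data are hypothetical and are not taken from the dataset.

```python
def central_moment(values, order):
    """Population central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)

def skewness(values):
    """Moment-based skewness: gamma_1 = m3 / m2^(3/2)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Excess kurtosis: m4 / m2^2 - 3 (0 for a normal distribution)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0

# Hypothetical toy data: mostly small counts with one large value,
# loosely mimicking the shape of NUM_INSTALMENT_VERSION.
toy = [0, 0, 0, 1, 1, 1, 1, 2, 2, 10]
print(skewness(toy) > 0)          # long right tail gives positive skewness
print(skewness([1, 2, 3, 4, 5]))  # symmetric data gives zero skewness
```

A large positive γ1 together with a large kurtosis, as reported for NUM_INSTALMENT_VERSION, indicates a long right tail driven by a small number of very large values.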

NUM_INSTALMENT_NUMBER

numerical

Approximate Distinct Count 277
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 217686416
Mean 18.8709
Minimum 1
Maximum 277
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • NUM_INSTALMENT_NUMBER is skewed right (γ1 = 2.4976)

Quantile Statistics

Minimum 1
5-th Percentile 1
Q1 4
Median 8
Q3 19
95-th Percentile 83
Maximum 277
Range 276
IQR 15

Descriptive Statistics

Mean 18.8709
Standard Deviation 26.6641
Variance 710.9725
Sum 2.5675e+08
Skewness 2.4976
Kurtosis 6.7051
Coefficient of Variation 1.413
  • NUM_INSTALMENT_NUMBER is not normally distributed (p-value 1.405496806081297e-19)
  • NUM_INSTALMENT_NUMBER has 1886320 outliers

DAYS_INSTALMENT

numerical

Approximate Distinct Count 2922
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 217686416
Mean -1042.27
Minimum -2922
Maximum -1
Zeros 0
Zeros (%) 0.0%
Negatives 13605401
Negatives (%) 100.0%
  • DAYS_INSTALMENT is skewed left (γ1 = -0.6287)

Quantile Statistics

Minimum -2922
5-th Percentile -2550
Q1 -1649
Median -813
Q3 -359
95-th Percentile -81
Maximum -1
Range 2921
IQR 1290

Descriptive Statistics

Mean -1042.27
Standard Deviation 800.9463
Variance 641514.9503
Sum -1.4181e+10
Skewness -0.6287
Kurtosis -0.7987
Coefficient of Variation -0.7685
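
Several rows in the Quantile and Descriptive Statistics blocks are derived from others: Range = Maximum − Minimum, IQR = Q3 − Q1, Variance = (Standard Deviation)², and Coefficient of Variation = Standard Deviation / Mean. The sketch below re-derives them from the DAYS_INSTALMENT figures above. The variable names are illustrative, and small differences from the reported values are expected because the inputs are already rounded.

```python
# Reported DAYS_INSTALMENT figures (copied from the report above).
minimum, maximum = -2922, -1
q1, q3 = -1649, -359
mean, std = -1042.27, 800.9463

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # IQR
variance = std ** 2               # Variance
cv = std / mean                   # Coefficient of Variation

print(value_range, iqr)
print(round(variance, 1), round(cv, 4))
```

The coefficient of variation is negative here only because the mean is negative (all values are day offsets before the reference date). Its magnitude, about 0.77, is the meaningful part.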

DAYS_ENTRY_PAYMENT

numerical

Approximate Distinct Count 3039
Approximate Unique (%) 0.0%
Missing 2905
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 217639936
Mean -1051.1137
Minimum -4921
Maximum -1
Zeros 0
Zeros (%) 0.0%
Negatives 13602496
Negatives (%) 100.0%
  • DAYS_ENTRY_PAYMENT is skewed left (γ1 = -0.6269)

Quantile Statistics

Minimum -4921
5-th Percentile -2558
Q1 -1658
Median -822
Q3 -368
95-th Percentile -89
Maximum -1
Range 4920
IQR 1290

Descriptive Statistics

Mean -1051.1137
Standard Deviation 800.5859
Variance 640937.7565
Sum -1.4298e+10
Skewness -0.6269
Kurtosis -0.8018
Coefficient of Variation -0.7617
  • DAYS_ENTRY_PAYMENT has 1 outlier
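
The report does not state how outliers are counted. One common convention, assumed here purely for illustration, is Tukey's rule: a value is an outlier if it lies more than 1.5 × IQR below Q1 or above Q3. The helper name `tukey_fences` is hypothetical. The quartiles and minima are the ones reported above.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) outlier fences under Tukey's k * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Reported quartiles and minima for the two day-offset columns.
entry_lower, entry_upper = tukey_fences(-1658, -368)   # DAYS_ENTRY_PAYMENT
inst_lower, inst_upper = tukey_fences(-1649, -359)     # DAYS_INSTALMENT

print(entry_lower, entry_upper)
print(-4921 < entry_lower)   # DAYS_ENTRY_PAYMENT minimum falls below the fence
print(-2922 < inst_lower)    # DAYS_INSTALMENT minimum stays inside the fence
```

Under this assumption the DAYS_ENTRY_PAYMENT minimum of -4921 lies below the lower fence of -3593, while the DAYS_INSTALMENT minimum of -2922 stays inside its fences. That matches the report flagging an outlier for the former and none for the latter, but it does not confirm which rule the profiler actually applies.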

AMT_INSTALMENT

numerical

Approximate Distinct Count 902539
Approximate Unique (%) 6.6%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 217686416
Mean 17050.907
Minimum 0
Maximum 3.7715e+06
Zeros 290
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • AMT_INSTALMENT is skewed right (γ1 = 16.2359)

Quantile Statistics

Minimum 0
5-th Percentile 197.91
Q1 4241.3738
Median 8948.07
Q3 16762.905
95-th Percentile 47243.79
Maximum 3.7715e+06
Range 3.7715e+06
IQR 12521.5312

Descriptive Statistics

Mean 17050.907
Standard Deviation 50570.2544
Variance 2.5574e+09
Sum 2.3198e+11
Skewness 16.2359
Kurtosis 388.8392
Coefficient of Variation 2.9658
  • AMT_INSTALMENT is not normally distributed (p-value 4.25527358370024e-25)
  • AMT_INSTALMENT has 1116341 outliers

AMT_PAYMENT

numerical

Approximate Distinct Count 944235
Approximate Unique (%) 6.9%
Missing 2905
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 217639936
Mean 17238.2232
Minimum 0
Maximum 3.7715e+06
Zeros 1440
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • AMT_PAYMENT is skewed right (γ1 = 14.9519)

Quantile Statistics

Minimum 0
5-th Percentile 116.055
Q1 3426.93
Median 8170.56
Q3 16173
95-th Percentile 47952
Maximum 3.7715e+06
Range 3.7715e+06
IQR 12746.07

Descriptive Statistics

Mean 17238.2232
Standard Deviation 54735.784
Variance 2.996e+09
Sum 2.3448e+11
Skewness 14.9519
Kurtosis 324.5958
Coefficient of Variation 3.1753
  • AMT_PAYMENT is not normally distributed (p-value 4.2620694326307305e-25)
  • AMT_PAYMENT has 1125717 outliers
